Abstract

We study the problem of scheduling a set of jobs with release dates, deadlines and
processing requirements (or works), on parallel speed-scaled processors so as to
minimize the total energy consumption. We consider that both preemption and migration
of jobs are allowed. An exact polynomial-time algorithm has been proposed for this
problem, which is based on the Ellipsoid algorithm. Here, we formulate the problem as
a convex program and we propose a simpler polynomial-time combinatorial algorithm
which is based on a reduction to the maximum flow problem. Our algorithm runs in
$O(n f(n) \log P)$ time, where $n$ is the number of jobs, $P$ is the range of all
possible values of processors' speeds divided by the desired accuracy and $f(n)$ is the
complexity of computing a maximum flow in a layered graph with $O(n)$ vertices.
Independently, Albers et al. [3] proposed an $O(n^2 f(n))$-time algorithm exploiting the
same relation with the maximum flow problem. We extend our algorithm to the
multiprocessor speed scaling problem with migration where the objective is the
minimization of the makespan under a budget of energy.

1 Introduction 

Energy consumption is a major issue nowadays. Great efforts are devoted to the
reduction of energy dissipation in computing environments ranging from small portable
devices to large data centers. From an algorithmic point of view, new challenging
optimization problems are studied, in which the energy consumption is taken into account
as a constraint or as the optimization goal itself (for recent reviews see [HE]). This
latter approach was adopted in the seminal paper of Yao et al. [15], where a set of
independent jobs with release dates and deadlines has to be scheduled on a single
processor so that the total energy is minimized, under the so-called speed-scaling model
where the processor may run at variable speeds. Under this model, if the speed of a
processor is $s$ then the power consumption is $s^\alpha$, where $\alpha > 1$ is a
constant, and the energy consumption is the power integrated over time.

Single processor case. In [15], Yao et al. proposed an optimal off-line algorithm, known
as the YDS algorithm after the initials of the authors, for the problem with preemption,
i.e. where the execution of a job may be interrupted and resumed later on. In the same
work, they initiated the study of online algorithms for the problem, introducing the
Average Rate (AVR) and the Optimal Available (OA) algorithms. Bansal et al. [^ proposed a
new online algorithm, the BKP algorithm after the authors' initials, which improves the
competitive ratio of OA for large values of $\alpha$.

Multiprocessor case. There are two variants of the model. The first variant, which we
call the non-migratory variant, allows the preemption of the jobs but not their
migration. This means that a job may be interrupted and resumed later on, on the same
processor, but it is not allowed to continue its execution on a different processor. In
the second variant, the migratory variant, both the preemption and the migration of the
jobs are allowed. In 0, Albers et al. considered the non-migratory problem of minimizing
the total energy consumption given that the jobs have release dates and deadlines. For
unit-work jobs, they proposed a polynomial-time algorithm when the deadlines of jobs are
agreeable. When the release dates and deadlines of jobs are arbitrary, they proved that
the problem becomes NP-hard even for unit-size jobs and proposed approximation
algorithms with constant approximation ratios for the off-line version of the problem. A
generic reduction is given by Greiner et al. (see [H]) transforming a
$\beta$-approximation algorithm for the single-processor problem to a
$\beta B_\alpha$-approximation algorithm for the multi-processor non-migratory problem,
where $B_\alpha$ is the $\alpha$-th Bell number. Also, they showed that a
$\beta$-approximation for multiple processors with migration yields a deterministic
$\beta B_\alpha$-approximation algorithm for multiple processors without migration.

For the migratory variant, Chen et al., in (TU], were the first to study the speed
scaling problem of minimizing the energy consumption on $m$ processors with migration.
In fact, they proposed a simple algorithm for the case where jobs have common release
dates and deadlines. In [8], Bingham and Greenstreet proposed a polynomial-time algorithm
for the general problem where each job has an arbitrary work, a release date and a
deadline, and the power function is any convex function. Their algorithm is based on the
use of the Ellipsoid method (see [13]). Since the Ellipsoid algorithm is not used in
practice, it was an open problem to define a faster combinatorial algorithm. When
preparing the current version of this paper, it came to our knowledge that Albers et al.
[3] considered the same problem and presented an optimal $O(n^2 f(n))$-time combinatorial
algorithm, where $n$ is the number of jobs and $f(n)$ the complexity of finding a maximum
flow in a layered graph with $O(n)$ vertices. Notice that in [3], nothing is mentioned
about the exact complexity of the algorithm, beyond its clear polynomiality. They also
extended the analysis of the single processor OA and AVR online algorithms to the
multiprocessor case with migration.

Multicriteria minimization. In general, minimizing the energy consumption is in
conflict with increasing the performance of many computing devices. Hence, a series of
papers addresses this problem in a multicriteria context. In [2], Pruhs et al. were the
first to study the problem of optimizing a time-related objective function with a budget
of energy. Their objective was to minimize the sum of flow times and they presented a
polynomial-time algorithm for the case of unit-work jobs. To prove that their algorithm
is optimal, they formulated the problem as a convex program and they applied the
well-known Karush-Kuhn-Tucker (KKT) conditions to get necessary conditions for
optimality. In [1], Albers and Fujiwara studied the problem of minimizing the sum of flow
times plus energy instead of having an energy budget, which gives rise to an alternative
way of combining the optimization of two conflicting criteria. For unit-work jobs, they
proposed online algorithms and an exact polynomial-time algorithm. In [9], Chan et al.
proposed an online algorithm to minimize the energy consumption and, among the schedules
with the minimum energy, they tried to find the one with the maximum throughput.
Assuming that there is an upper bound on the processor's speed, they established
constant-factor competitiveness both in terms of energy and throughput.

Our contribution and organization of the paper. We consider the multiprocessor
migratory scheduling problem with the objective of minimizing the energy consumption. In
Section 3, we give the first convex programming formulation of the problem and in
Section 4, we apply, for the first time, the well-known KKT conditions. In this way, we
obtain a set of properties that need to be satisfied by any optimal schedule. Then in
Section 5, we propose an optimal algorithm in the case where the jobs have release
dates, deadlines and the power function is of the form $s^\alpha$. The time complexity
of our algorithm, which we call BAL, is in $O(n f(n) \log P)$, where $n$ is the number of
jobs, $P$ is the range of all possible values of processors' speed divided by the desired
accuracy and $f(n)$ is the complexity of computing a maximum flow in a layered graph with
$O(n)$ vertices. We also give a brief description of the relation between our algorithm
and the one of Albers et al. [3], as well as the analysis of their algorithm's
complexity. Finally in Section 6, we extend BAL to obtain an optimal algorithm for the
problem of makespan minimization with a budget of energy.

2 Preliminaries 

Let $\mathcal{J} = \{j_1, \ldots, j_n\}$ be a set of jobs. Each job $j_i$ is specified by
a work $w_i$, a release date $r_i$ and a deadline $d_i$. We define $span_i = [r_i, d_i]$
and we say that $j_i$ is alive at time $t$ if $t \in span_i$. We also define the density
of job $j_i$ as $den_i = w_i/(d_i - r_i)$. We assume a set of $m$ variable-speed
homogeneous processors in the sense that they can all, dynamically, change their speeds
and have a common speed-to-power function $P(t) = s(t)^\alpha$ where $P(t)$ is the power
consumption at time $t$, $s(t)$ is the speed (or frequency) at time $t$ and $\alpha > 1$
is a constant. Consider any interval of time $[a,b]$ and a given processor. The amount of
work processed by this processor and its energy consumption during $[a,b]$ are
$\int_a^b s(t)\,dt$ and $\int_a^b s(t)^\alpha\,dt$, respectively. Hence, if a job is
continuously run at a constant speed $s$ during an interval of length $\ell$, then
$w = s \cdot \ell$ units of work are completed and $E = s^\alpha \cdot \ell$ units of
energy are consumed. In our setting, preemption and migration of jobs are allowed. That
is, the processing of a job may be suspended and resumed later on the same processor or
on a different one. Nevertheless, we do not allow parallel execution of a job, which
means that a job cannot be run simultaneously on two or more processors. We also assume
that a continuous spectrum of speeds is available and that there is no upper bound on
the speed of any processor. Our objective is to find a feasible schedule that minimizes
the total energy consumed by all processors.
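
The relation $E = s^\alpha \cdot \ell$ can be illustrated with a short Python sketch. The function names and the choice $\alpha = 3$ are assumptions made only for this illustration; the point is that, for a fixed amount of work and a fixed time window, an uneven split of the work costs more energy than a constant speed, which is the convexity effect used later on.

```python
def energy(work, length, alpha=3.0):
    """Energy needed to complete `work` units at constant speed over `length` time units."""
    speed = work / length           # w = s * l  =>  s = w / l
    return speed ** alpha * length  # E = s^alpha * l


def energy_split(work, length, share, alpha=3.0):
    """Energy when a fraction `share` of the work is done in the first half of the
    window and the rest in the second half (each half run at its own constant speed)."""
    half = length / 2
    return energy(work * share, half, alpha) + energy(work * (1 - share), half, alpha)


if __name__ == "__main__":
    print(energy(4, 2))              # constant speed 2 over 2 time units
    print(energy_split(4, 2, 0.75))  # speed 3 for one time unit, then speed 1
```

An even split (`share = 0.5`) reproduces the constant-speed energy, while any other split is strictly more expensive for $\alpha > 1$.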

We define $T = \{t_0, \ldots, t_L\}$ to be the set of release dates and deadlines taken in
non-decreasing order and without duplication. It is clear that
$t_0 = \min_{j_i \in \mathcal{J}}\{r_i\}$ and $t_L = \max_{j_i \in \mathcal{J}}\{d_i\}$.
Let $I_j = [t_{j-1}, t_j]$, for $1 \le j \le L$, and
$\mathcal{I} = \{I_1, \ldots, I_L\}$. We denote by $|I_j|$ the length of the interval
$I_j$. Also, let $A(j)$ be the set of jobs that are alive during $I_j$, i.e. all the jobs
$j_i$ with $I_j \subseteq span_i$, and let $a_j = |A(j)|$ be the number of jobs in $A(j)$.
Given any schedule $S$, we denote by $t_{i,j}$ the total units of time that job $j_i$ is
processed during the interval $I_j$ by $S$. As already mentioned in many other works (see
[15] for example), one can show, through a simple exchange argument, that there always
exists an optimal schedule in which every job $j_i$ is run at a constant speed $s_i$, and
this comes from the convexity of the power function.
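
The interval decomposition defined above can be computed directly from the release dates and deadlines. The following Python sketch is only an illustration: the tuple layout `(release, deadline, work)` and the function name `build_intervals` are assumptions, and jobs are identified by their 0-based position in the input list.

```python
def build_intervals(jobs):
    """Split the time horizon at every release date and deadline.

    jobs: list of (release, deadline, work) tuples.
    Returns (intervals, alive) where intervals[j] = (t_{j}, t_{j+1}) and
    alive[j] lists the indices of the jobs whose span contains that interval.
    """
    # T: distinct release dates and deadlines in increasing order.
    points = sorted({t for r, d, _ in jobs for t in (r, d)})
    intervals = list(zip(points, points[1:]))
    alive = [
        [i for i, (r, d, _) in enumerate(jobs) if r <= a and b <= d]
        for a, b in intervals
    ]
    return intervals, alive


if __name__ == "__main__":
    intervals, alive = build_intervals([(0, 4, 2), (1, 3, 1)])
    print(intervals)  # consecutive breakpoints of T
    print(alive)      # A(j) for each interval
```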

Next, we state a variation of our problem that we will need throughout our analysis. We
call it the Work Assignment Problem (or WAP) and it can be described as follows: Consider
a set of $n$ jobs $\mathcal{J} = \{j_1, j_2, \ldots, j_n\}$ and a set of intervals
$\mathcal{I} = \{I_1, I_2, \ldots, I_L\}$. Each job can be alive in one or more intervals
in $\mathcal{I}$. During each interval $I_j$ there are $m_j$ available processors.
Moreover, we are given a value $v$. Our objective is to find whether or not there is a
feasible schedule that executes all jobs in $\mathcal{J}$ with constant speed $v$. Recall
that a schedule is feasible if and only if each job is executed during its alive
intervals and is executed by at most one processor at each time $t$. Preemption and
migration of jobs are allowed. Note that the WAP is almost the problem
$P|r_i, d_i, pmtn|-$ (see [7]), with the difference that, in WAP, not all intervals have
the same number of available processors. Therefore, WAP is polynomially solvable by
applying a variant of an algorithm for $P|r_i, d_i, pmtn|-$.

3 Convex Programming Formulation 

Our problem can be formulated as the following convex program: 
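
Written out, and consistent with the descriptions of (1)-(7) in this section and with the complementary slackness conditions (9)-(14) of Section 4, the program plausibly takes the form below. This is a reconstruction under those assumptions rather than a verbatim statement; the variables are the speeds $s_i$ and the execution times $t_{i,j}$.

```latex
\begin{align}
\min \quad & \sum_{j_i \in \mathcal{J}} w_i s_i^{\alpha-1} \tag{1}\\
\text{s.t.} \quad
& \frac{w_i}{s_i} - \sum_{j:\, j_i \in A(j)} t_{i,j} \le 0 && j_i \in \mathcal{J} \tag{2}\\
& \sum_{j_i \in A(j)} t_{i,j} \le m\,|I_j| && 1 \le j \le L \tag{3}\\
& \sum_{j_i \in A(j)} t_{i,j} \le a_j\,|I_j| && 1 \le j \le L \tag{4}\\
& t_{i,j} \le |I_j| && 1 \le j \le L,\; j_i \in A(j) \tag{5}\\
& t_{i,j} \ge 0 && 1 \le j \le L,\; j_i \in A(j) \tag{6}\\
& s_i \ge 0 && j_i \in \mathcal{J} \tag{7}
\end{align}
```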
Note that the total running time and the total energy consumption of each job $j_i$ are
$w_i/s_i$ and $w_i s_i^{\alpha-1}$, respectively. Then, the term (1) is the total energy
consumed by all jobs, which is our objective function, and the constraints (2) enforce
that $w_i$ amount of work must be executed for each job $j_i$. The constraints (3)
enforce that we can use at most $m$ processors for $|I_j|$ units of time during any
interval $I_j$. Also, we can use at most $a_j$ processors operating for $|I_j|$ units of
time during any interval $I_j$, otherwise we would have parallel execution of a job, and
this is expressed by (4). The constraints (5) prevent any job $j_i$ from being executed
for more than $|I_j|$ units of time during any interval $I_j \subseteq span_i$. Note that
constraints (4) and (5) are both needed and neither is implied by the other. The
constraints (6) and (7) ensure the non-negativity of the variables $t_{i,j}$ and $s_i$,
respectively.

The above mathematical program is indeed convex because, as mentioned by other works 
(e.g. [H]), the objective function and the first constraint are convex while all the other 
constraints are linear. Since our problem can be written as a convex program, it can be solved 
in polynomial time by applying the Ellipsoid Algorithm [13]. Nevertheless, the Ellipsoid 
Algorithm is not used in practice and we would like to construct a faster and less complicated 
combinatorial algorithm. 

At this point, notice that once the speeds of the jobs are computed, by solving the
convex program, a further step is needed in order to construct a feasible schedule. This
is exactly the feasibility problem $P|r_i, d_i, pmtn|-$.

4 KKT Conditions 

We apply the KKT conditions to the above convex program to obtain necessary conditions 
for optimality of a feasible schedule. We also show that these conditions are sufficient for 
optimality. 

Assume that we are given the following convex program:

\[
\begin{array}{lll}
\min & f(x) & \\
     & g_i(x) \le 0 & \quad 1 \le i \le m
\end{array}
\]

Suppose that the program is strictly feasible, i.e. there is a point $x$ such that
$g_i(x) < 0$ for all $1 \le i \le m$, and all functions $g_i$ are differentiable. Let
$\lambda_i$ be the dual variable associated with the constraint $g_i(x) \le 0$. The
Karush-Kuhn-Tucker (KKT) conditions are:

\begin{align*}
g_i(x) &\le 0 & 1 \le i \le m \\
\lambda_i &\ge 0 & 1 \le i \le m \\
\lambda_i\, g_i(x) &= 0 & 1 \le i \le m \\
\nabla f(x) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x) &= 0 &
\end{align*}

KKT conditions are necessary and sufficient for solutions $x \in \mathbb{R}^n$ and
$\lambda \in \mathbb{R}^m$ to be primal and dual optimal. We refer to the above
conditions as primal feasibility, dual feasibility, complementary slackness and
stationarity conditions, respectively.



The following lemma is a direct consequence of the KKT conditions for the convex pro- 
gram of our problem. 

Lemma 1 A feasible schedule for our problem is optimal if and only if it satisfies the
following properties:

1. Each job $j_i$ is executed at a constant speed $s_i$.

2. If a job $j_i$ is not executed during an interval $I_j \subseteq span_i$, i.e.
$t_{i,j} = 0$, then $s_i \le s_k$ for every job $j_k$ with $I_j \subseteq span_k$ and
$t_{k,j} > 0$.

3. If a job $j_i$ has $t_{i,j} = |I_j|$ for an interval $I_j$, then $s_i \ge s_k$ for any
job $j_k$ alive during $I_j$ with $t_{k,j} < |I_j|$.

4. All jobs $j_i$ that are alive during $I_j$ with $0 < t_{i,j} < |I_j|$ have equal speeds.

5. If $a_j \le m$ during an interval $I_j$, then $t_{i,j} = |I_j|$ for every $j_i$ with
$I_j \subseteq span_i$.

Proof: 

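In order to apply the KKT conditions, we need to associate with each constraint a dual
variable. Therefore, to each set of constraints from (2) up to (7) we associate the dual
variables $\beta_i$, $\gamma_j$, $\delta_j$, $\epsilon_{i,j}$, $\zeta_{i,j}$ and
$\eta_i$, respectively.
By stationarity conditions, we have that

\begin{align*}
& \nabla \sum_{j_i \in \mathcal{J}} w_i s_i^{\alpha-1}
 + \sum_{j_i \in \mathcal{J}} \beta_i \nabla\Big(\frac{w_i}{s_i} - \sum_{j:\, j_i \in A(j)} t_{i,j}\Big)
 + \sum_{j=1}^{L} \gamma_j \nabla\Big(\sum_{j_i \in A(j)} t_{i,j} - m|I_j|\Big) \\
& + \sum_{j=1}^{L} \delta_j \nabla\Big(\sum_{j_i \in A(j)} t_{i,j} - a_j|I_j|\Big)
 + \sum_{j=1}^{L} \sum_{j_i \in A(j)} \epsilon_{i,j} \nabla\big(t_{i,j} - |I_j|\big)
 + \sum_{j=1}^{L} \sum_{j_i \in A(j)} \zeta_{i,j} \nabla(-t_{i,j})
 + \sum_{j_i \in \mathcal{J}} \eta_i \nabla(-s_i) = 0
\end{align*}

The above equation can be rewritten equivalently as

\begin{equation}
\sum_{j_i \in \mathcal{J}} \Big( (\alpha-1) w_i s_i^{\alpha-2} - \frac{\beta_i w_i}{s_i^2} - \eta_i \Big) \nabla s_i
 + \sum_{j=1}^{L} \sum_{j_i \in A(j)} \big( -\beta_i + \gamma_j + \delta_j + \epsilon_{i,j} - \zeta_{i,j} \big) \nabla t_{i,j} = 0
\tag{8}
\end{equation}

Furthermore, complementary slackness conditions imply that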



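\begin{align}
\beta_i \cdot \Big( \frac{w_i}{s_i} - \sum_{j:\, j_i \in A(j)} t_{i,j} \Big) &= 0 & j_i \in \mathcal{J} \tag{9}\\
\gamma_j \cdot \Big( \sum_{j_i \in A(j)} t_{i,j} - m \cdot |I_j| \Big) &= 0 & 1 \le j \le L \tag{10}\\
\delta_j \cdot \Big( \sum_{j_i \in A(j)} t_{i,j} - a_j \cdot |I_j| \Big) &= 0 & 1 \le j \le L \tag{11}\\
\epsilon_{i,j} \cdot \big( t_{i,j} - |I_j| \big) &= 0 & 1 \le j \le L,\; j_i \in A(j) \tag{12}\\
\zeta_{i,j} \cdot (-t_{i,j}) &= 0 & 1 \le j \le L,\; j_i \in A(j) \tag{13}\\
\eta_i \cdot (-s_i) &= 0 & j_i \in \mathcal{J} \tag{14}
\end{align}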

We can safely assume that there are no jobs with zero work because we may treat such jobs
as if they did not exist. So, for any job $j_i$ it holds that $s_i > 0$ and
$\sum_{I_j \subseteq span_i} t_{i,j} > 0$. Then, (14) implies that $\eta_i = 0$. We set
the coefficients of the partial derivatives $\nabla s_i$ and $\nabla t_{i,j}$ equal to
zero so as to satisfy the stationarity conditions. Thus, (8) gives that
$\beta_i = (\alpha - 1) s_i^\alpha$ for each job $j_i \in \mathcal{J}$ and

\begin{equation}
(\alpha - 1) s_i^\alpha = \gamma_j + \delta_j + \epsilon_{i,j} - \zeta_{i,j} \tag{15}
\end{equation}

for each $j_i \in \mathcal{J}$ and $I_j \subseteq span_i$. Now, for each interval $I_j$ we
have the following cases:

Case 1: $a_j > m$
In this case, it is obvious that all processors operate during the whole interval in any
optimal schedule. Because of (11), $\delta_j = 0$. We consider the following subcases on
the execution time of any job $j_i \in A(j)$:

1. Subcase A: $0 < t_{i,j} < |I_j|$

Complementary slackness conditions (12), (13) imply that
$\epsilon_{i,j} = \zeta_{i,j} = 0$. As a result, (15) can be written as

\[
(\alpha - 1) s_i^\alpha = \gamma_j
\]

The variable $\gamma_j$ is specific to each interval and, as a result, all jobs of this
subcase have the same speed throughout the whole schedule. We denote this speed by $v_j$
for each interval $I_j$.

2. Subcase B: $t_{i,j} = |I_j|$

In this case, by (13) and (15), we get that

\begin{equation}
(\alpha - 1) s_i^\alpha = \gamma_j + \epsilon_{i,j} \tag{16}
\end{equation}

Hence, all jobs of this kind have $s_i \ge v_j$.

3. Subcase C: $t_{i,j} = 0$

This means, by (12) and (15), that

\begin{equation}
(\alpha - 1) s_i^\alpha = \gamma_j - \zeta_{i,j} \tag{17}
\end{equation}

and thus, $s_i \le v_j$.

Case 2: $a_j < m$
In this case, each job in $A(j)$ is executed throughout the whole interval $I_j$, in every
optimal schedule. This argument comes from the convexity of the speed-to-power function.
Therefore, each job $j_i \in A(j)$ has $\zeta_{i,j} = 0$. Moreover, since fewer than $m$
processors are used, we have that $\gamma_j = 0$. That is, for each $j_i \in A(j)$ we have
$(\alpha - 1) s_i^\alpha = \delta_j + \epsilon_{i,j}$. From this set of equations, we
cannot establish any strong relation between the speeds of the jobs that are alive
during an interval $I_j$.

Case 3: $a_j = m$
This case can be handled exactly as the previous one with the difference that
$\gamma_j \ge 0$ and thus, we get that
$(\alpha - 1) s_i^\alpha = \gamma_j + \delta_j + \epsilon_{i,j}$.
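
The two relations used throughout the case analysis, $\beta_i = (\alpha-1)s_i^\alpha$ and (15), can be checked directly. The sketch below assumes the objective $\sum_i w_i s_i^{\alpha-1}$ and the work constraint $w_i/s_i - \sum_j t_{i,j} \le 0$ (the form suggested by (9)), and sets the partial derivatives of the Lagrangian with respect to $s_i$ and $t_{i,j}$ to zero.

```latex
\begin{align*}
\frac{\partial}{\partial s_i}:\quad
 & (\alpha-1)\, w_i s_i^{\alpha-2} - \beta_i \frac{w_i}{s_i^{2}} - \eta_i = 0
 \;\overset{\eta_i = 0}{\Longrightarrow}\;
 \beta_i = (\alpha-1)\, s_i^{\alpha}, \\
\frac{\partial}{\partial t_{i,j}}:\quad
 & -\beta_i + \gamma_j + \delta_j + \epsilon_{i,j} - \zeta_{i,j} = 0
 \;\Longrightarrow\;
 (\alpha-1)\, s_i^{\alpha} = \gamma_j + \delta_j + \epsilon_{i,j} - \zeta_{i,j}.
\end{align*}
```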

Given a solution of the convex program that satisfies the KKT conditions, we derived 
some relations between the primal variables. Based on them, we defined some structural 
properties of any optimal schedule. These properties are necessary for optimality and we 
show that they are also sufficient because any schedule that satisfies these properties is 
optimal. 

Assume for the sake of contradiction that there is a schedule $A$ that satisfies the
properties of Lemma 1 and is not optimal, and let $B$ be an optimal schedule. We denote
by $E^X$, $s_i^X$ and $t_{i,j}^X$ the energy consumption, the speed of job $j_i$ and the
total execution time of job $j_i$ during the interval $I_j$ in a schedule $X$,
respectively. Then, $E^A > E^B$. Let $S$ be the set of jobs $j_i$ with $s_i^A > s_i^B$.
Clearly, there is at least one job $j_k$ such that $s_k^A > s_k^B$, otherwise $A$ would
not consume more energy than $B$. So, $S \neq \emptyset$. By definition of $S$,

\[
\sum_{j_i \in S} \; \sum_{j:\, j_i \in A(j)} t_{i,j}^A
 \;<\; \sum_{j_i \in S} \; \sum_{j:\, j_i \in A(j)} t_{i,j}^B
\]

Hence, there is at least one interval $I_p$ such that

\[
\sum_{j_i \in S} t_{i,p}^A \;<\; \sum_{j_i \in S} t_{i,p}^B
\]

This gives that $t_{k,p}^A < t_{k,p}^B$ for some job $j_k \in S$. Thus,
$t_{k,p}^A < |I_p|$ and $t_{k,p}^B > 0$. If we consider any interval $I_j$, the sum of
processing times of all jobs in $I_j$ is the same for all schedules satisfying Lemma 1.
So, there must be a job $j_i \notin S$ such that $t_{i,p}^A > t_{i,p}^B$. Therefore,
$t_{i,p}^A > 0$ and $t_{i,p}^B < |I_p|$. Applying the properties of Lemma 1 to $I_p$ in
both schedules, we conclude that $s_i^A \ge s_k^A > s_k^B \ge s_i^B$, which contradicts
the fact that $j_i \notin S$.



Notice that Lemma 1 does not explain how to find an optimal schedule. The basic reason
is that it does not determine the speed value of each job. Moreover, it does not specify exactly 
the structure of the optimal schedule. That is, it does not specify which job is executed by 
each processor at each time t. 



5 An Optimal Combinatorial Algorithm 

In this section, we propose a combinatorial algorithm for our problem which always constructs 
a schedule satisfying the properties stated in the previous section. Our algorithm is based 
on the notion of critical jobs defined below. The basic idea is to continuously decrease the 
speeds of jobs step by step. At each step, we assign a speed to the critical jobs that we 
ignore in the subsequent steps and we continue with the remaining subset of jobs. At the 
end of the last step, every job has been assigned a speed. In order to recognize the critical 
jobs, we consider a reduction to the Work Assignment Problem (WAP). 

Let us first give some notation and definitions concerning the maximum flow and minimum
cut problems. Consider a graph $G = (V, E)$ in which each edge $(u, v)$ has capacity
$c(u, v)$, and two nodes $s, t \in V$. An $(s,t)$-cut of $G$ is a partition of its nodes
into two disjoint subsets $X$ and $Y$ so that if we remove the edges $(u, w)$ with
$u \in X$ and $w \in Y$, the nodes $s$ and $t$ are disconnected, i.e. there is no path
from $s$ to $t$. A minimum $(s,t)$-cut $(X, Y)$ is a cut for which the sum of the
capacities of the edges $(u, w)$ with $u \in X$ and $w \in Y$ is minimized. In the
following, we will consider an $(s,t)$-cut as the set of these edges. Also, given an
$(s,t)$-flow of a graph $G = (V, E)$, we use the term $f(e)$ to denote the amount of flow
that passes through the edge $e \in E$.

Given a graph $G$ and a flow $\mathcal{F}$, we define the residual graph $G_f$ of $G$ with
respect to $\mathcal{F}$ as follows: (i) $G_f$ has the same set of nodes as $G$, (ii) for
each edge $(u,v)$ in $G$ on which $f(u,v) < c(u,v)$, we include the edge $(u,v)$ with
capacity $c(u,v) - f(u,v)$, and (iii) for each edge $(u,v)$ with $f(u,v) > 0$, we include
the edge $(v,u)$ with capacity $f(u,v)$. Next, we define the notion of upstream nodes
that we will need throughout our analysis. A node $v$ is upstream if, for all minimum
$(s,t)$-cuts $(X,Y)$, $v$ belongs to $X$. That is, $v$ lies on the source side of every
minimum cut.

Now, for each instance of the WAP, we define a graph so as to reduce our original problem
to the maximum flow problem. Given an instance
$\langle \mathcal{J}, \mathcal{I}, v \rangle$ of the WAP, consider the graph
$G = (V,E)$ that contains one node $x_i$ for each job $j_i$, one node $y_j$ for each
interval $I_j$, a source node $s$ and a destination node $t$. We introduce an edge
$(s, x_i)$ for each $j_i \in \mathcal{J}$ with capacity $w_i/v$, an edge $(x_i, y_j)$ with
capacity $|I_j|$ for each pair of $j_i$ and $I_j$ such that $j_i \in A(j)$, and an edge
$(y_j, t)$ with capacity $m_j |I_j|$ for each interval $I_j \in \mathcal{I}$. We say that
this is the corresponding graph of $\langle \mathcal{J}, \mathcal{I}, v \rangle$.

At this point, we are ready to introduce the notion of criticality. Given a feasible
instance for the WAP, we say that job $j_i$ is critical if and only if for any feasible
schedule and for each $I_j \subseteq span_i$, either $t_{i,j} = |I_j|$ or
$\sum_{j_k \in A(j)} t_{k,j} = m_j |I_j|$. Furthermore, we say that an instance
$\langle \mathcal{J}, \mathcal{I}, v \rangle$ of the WAP is critical if and only if $v$
is the minimum speed so that the set of jobs $\mathcal{J}$ can be feasibly executed over
the intervals in $\mathcal{I}$. With respect to graph $G$, a job $j_i$ is critical if and
only if for any maximum flow, either the edge $(x_i, y_j)$ or the edge $(y_j, t)$ is
saturated for each $I_j$ such that $j_i \in A(j)$. Notice that job $j_i$ is also critical
for the instance $\langle \mathcal{J}, \mathcal{I}, v - \epsilon \rangle$, for any
$\epsilon > 0$.



5.1 Properties of the Work Assignment Problem 

Next, we will prove some lemmas that will guide us to an optimal algorithm. Our algorithm 
will be based on a reduction of our problem to the maximum flow problem which is a 
consequence of the following lemma. 

Lemma 2 There exists a feasible schedule for the work assignment problem if and only
if the corresponding graph has maximum $(s,t)$-flow equal to $\sum_{i=1}^{n} w_i/v$.
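
Lemma 2 suggests a direct feasibility test: build the corresponding graph and compare its maximum flow with $\sum_i w_i/v$. The following Python sketch does this with a plain Edmonds-Karp routine. The function names, the data layout (`works[i]` for $w_i$, `alive[j]` for $A(j)$, `lengths[j]` for $|I_j|$, `procs[j]` for $m_j$) and the floating-point tolerance are illustrative assumptions, not part of the paper.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts of residual capacities (mutated in place)."""
    total = 0.0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:      # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        path, v = [], t
        while parent[v] is not None:          # walk back from t to s
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= push
            cap[w][u] += push
        total += push


def wap_feasible(works, alive, lengths, procs, v, eps=1e-9):
    """True if all jobs can run at speed v, i.e. max flow equals sum(w_i / v)."""
    cap = defaultdict(dict)

    def add_edge(u, w, c):
        cap[u][w] = cap[u].get(w, 0.0) + c
        cap[w].setdefault(u, 0.0)             # reverse residual edge

    for i, w in enumerate(works):
        add_edge("s", ("x", i), w / v)        # job i needs w_i / v units of time
    for j, (length, m_j) in enumerate(zip(lengths, procs)):
        add_edge(("y", j), "t", m_j * length)  # m_j processors during I_j
        for i in alive[j]:
            add_edge(("x", i), ("y", j), length)  # no parallel execution of a job
    return max_flow(cap, "s", "t") >= sum(works) / v - eps


if __name__ == "__main__":
    # Two unit-work jobs sharing one processor during one unit-length interval.
    print(wap_feasible([1, 1], [[0, 1]], [1], [1], 2.0))
    print(wap_feasible([1, 1], [[0, 1]], [1], [1], 1.5))
```

The edge $(x_i, y_j)$ with capacity $|I_j|$ is what forbids running a job on two processors at once: even with spare processors, a single job cannot receive more than $|I_j|$ units of time in $I_j$.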

At this point, we state a claim concerning the upstream nodes that we will need in one
of the proofs that follow. For completeness, we present a proof, which can also be found
in [12].



Claim 1 [12] The set of upstream nodes is the set of nodes reachable from the source node
$s$ in the residual graph of any maximum flow, and therefore the upstream nodes can be
found by performing a breadth-first search (BFS) starting from $s$.

Proof:

Let $(X, Y)$ be the cut found after performing a BFS on the residual graph $G_f$,
starting from the source $s$, at the end of any maximum flow algorithm. If a node $v$ is
upstream then it must belong to $X$. Conversely, assume that $v \in X$ and $v$ is not an
upstream node. This means that there is a minimum cut $(X', Y')$ with $v \in Y'$. Given
that $v \in X$, there is a path $P$ from $s$ to $v$ in $G_f$. Since $v \in Y'$, $P$ must
have an edge $(u, w)$ with $u \in X'$ and $w \in Y'$. However, this is a contradiction
since there is an edge in $G_f$ that goes from the source side to the sink side of a
minimum cut. ■
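
Claim 1 translates into a short procedure: compute any maximum flow, then collect the nodes reachable from $s$ in the residual graph. The Python sketch below is a minimal illustration with integer capacities; the edge-list format `(u, v, capacity)` and the function names are assumptions made for this example.

```python
from collections import deque


def bfs_parents(cap, s):
    """BFS over edges with positive residual capacity; returns the parent map."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def upstream_nodes(edges, s, t):
    """Nodes reachable from s in the residual graph of a maximum (s,t)-flow."""
    cap = {}
    for u, v, c in edges:
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)               # reverse residual edge
    while True:
        parent = bfs_parents(cap, s)
        if t not in parent:                   # no augmenting path: flow is maximum
            return set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= push
            cap[w][u] += push


if __name__ == "__main__":
    edges = [("s", "a", 2), ("a", "t", 1), ("s", "b", 1), ("b", "t", 2)]
    print(upstream_nodes(edges, "s", "t"))
```

In this small graph the only minimum cut separates $\{s, a\}$ from $\{b, t\}$, so the upstream nodes are $s$ and $a$.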

The following lemmas that involve the notions of critical job and critical instance are 
important ingredients for the analysis of our algorithm. 

Lemma 3 If $\langle \mathcal{J}, \mathcal{I}, v \rangle$ is a critical instance of WAP,
then there is at least one critical job $j_i \in \mathcal{J}$.

Proof:

Let $G$ be the graph that corresponds to a critical instance
$\langle \mathcal{J}, \mathcal{I}, v \rangle$, and let $G'$ be the graph that corresponds
to the instance $\langle \mathcal{J}, \mathcal{I}, v - \epsilon \rangle$, for a small
constant $\epsilon > 0$ that approaches zero. Since
$\langle \mathcal{J}, \mathcal{I}, v \rangle$ is critical, there is no feasible
$(s,t)$-flow equal to $\sum_{j_i \in \mathcal{J}} \frac{w_i}{v - \epsilon}$ in $G'$.
Because of the max-flow min-cut theorem, we can conclude that any minimum $(s,t)$-cut of
$G'$ has capacity strictly less than
$\sum_{j_i \in \mathcal{J}} \frac{w_i}{v - \epsilon}$ and, as a result, there is no
minimum $(s,t)$-cut of $G'$ that includes all edges $(s, x_i)$. If all edges $(s, x_i)$
were included in a minimum $(s,t)$-cut, then $G'$ would have an $(s,t)$-flow in which all
these edges would be saturated, which implies that there would be a feasible
$(s,t)$-flow for $G'$ with value $\sum_{j_i \in \mathcal{J}} \frac{w_i}{v - \epsilon}$.

The remainder of the proof is based on the notion of upstream nodes. For that, it
suffices to observe that given any maximum flow, there is always an edge $(s, x_i)$ that
is not saturated. Firstly, we need to show that there is always an $x_i$ node in $G'$
which belongs to the set of upstream nodes. If we apply breadth-first search on the
residual graph of $G'$, we will reach $x_i$, which implies that $x_i$ is upstream. Thus,
for every path $x_i, y_j, t$ of $G'$, there is always an edge $(x_i, y_j)$ or $(y_j, t)$
that is saturated by any maximum flow. This holds since if not, there would be an
unsaturated $(s,t)$ path (a path is saturated if at least one of its edges is
saturated), contradicting the maximality of the flow. Hence, $j_i$, the job that
corresponds to $x_i$, is a critical job. ■



Lemma 4 Let $G = (V,E)$ be the graph that corresponds to the instance
$\langle \mathcal{J}, \mathcal{I}, v \rangle$ of the WAP. If the edge $(y_j, t) \in E$
belongs to a minimum $(s,t)$-cut of $G$ and there is a maximum $(s,t)$-flow such that
$f(x_i, y_j) > 0$, then $j_i$ is critical.

Proof:

Suppose that the edge $(y_j, t)$ belongs to a minimum $(s,t)$-cut $C$ and that there is a
maximum $(s,t)$-flow $\mathcal{F}$ such that $f(x_i, y_j) > 0$. $C$ is saturated by any
maximum flow. Since $f(x_i, y_j) > 0$, it is not possible that a path from $x_i$ to $t$
is left unsaturated by $\mathcal{F}$ because if this were the case, then we could send
part of $f(x_i, y_j)$ through the unsaturated path and this would contradict the fact
that $(y_j, t)$ belongs to a minimum $(s,t)$-cut. Since $\mathcal{F}$ is a maximum
$(s,t)$-flow and saturates all the paths from $x_i$ to $t$, there should be a minimum
$(s,t)$-cut $C'$ that contains one edge from each such path (the one that is saturated
by $\mathcal{F}$). Hence, $j_i$ is critical. ■



Our algorithm is based on the following lemma in order to determine critical jobs. 

Lemma 5 Assume that $\langle \mathcal{J}, \mathcal{I}, v \rangle$ is a critical instance
for WAP and let $G'$ be the graph that corresponds to the instance
$\langle \mathcal{J}, \mathcal{I}, v - \epsilon \rangle$. Then, any minimum $(s,t)$-cut
$C$ of $G'$ contains:

(i) exactly one edge of every path $x_i, y_j, t$ for any critical job $j_i$ of $G$,

(ii) all the $(s, x_i)$ edges for any non-critical job $j_i$ of $G$.

Proof:

Consider any critical job $j_i$. Assume that there is a path $x_i, y_j, t$ in $G'$ such
that none of its edges belongs to a minimum $(s,t)$-cut $C$. Then there is a maximum
$(s,t)$-flow $\mathcal{F}$ that does not saturate the edges $(x_i, y_j)$ and $(y_j, t)$.
If the edge $(s, x_i)$ was not saturated, then $\mathcal{F}$ would not be a maximum flow.
On the other hand, if $(s, x_i)$ was saturated by $\mathcal{F}$, then job $j_i$ would not
be critical for $\langle \mathcal{J}, \mathcal{I}, v \rangle$. In both cases, we have a
contradiction.

Similarly, assume that $j_i$ is not critical for the instance
$\langle \mathcal{J}, \mathcal{I}, v \rangle$ and suppose that the edge $(s, x_i)$ does
not belong to a minimum cut of $G'$. This means that there is a maximum $(s,t)$-flow
$\mathcal{F}$ that does not saturate this edge. If there is at least one path
$x_i, y_j, t$ that is not saturated, then $\mathcal{F}$ is not maximum, and if all paths
are saturated then $j_i$ is a critical job for
$\langle \mathcal{J}, \mathcal{I}, v \rangle$, which is a contradiction. ■
5.2 The BAL Algorithm

We are now ready to give a high-level description of our algorithm. Initially, we assume
that the optimal schedule consumes an unbounded amount of energy and that all jobs are
executed with the same speed $s_{UB}$. This speed is such that there exists a feasible
schedule that executes all jobs with the same speed. Then, we decrease the speed of all
jobs up to the point where no further reduction is possible while still obtaining a
feasible schedule. At this point, all jobs are assumed to be executed with the same
speed, which is critical, and there is at least one job that cannot be executed with
speed less than this. The jobs that cannot be executed with speed less than the critical
one form the current set of critical jobs. So, the critical jobs are assigned the
critical speed and are ignored after this point. That is, in what follows, the algorithm
considers the subproblem in which some jobs are omitted (the critical jobs), because they
are already assigned the lowest speed possible (the critical speed) so that they can be
feasibly executed, and there are fewer than $m$ processors during some intervals because
these processors are dedicated to the omitted jobs (i.e. we get an instance of WAP). Our
algorithm can be described as follows:

Algorithm 1 BAL 

1: Sub = max{maxj{— ^^^^^^^-^},maxj^{cfenj}}, Slb = maxj^gjjfieni} 

2: while $\mathcal{J} \neq \emptyset$ do
3:    Find the minimum speed $s_{crit}$ so that the instance
      $\langle \mathcal{J}, \mathcal{I}, s_{crit} \rangle$ of the WAP is feasible, using
      binary search in the interval $[s_{LB}, s_{UB}]$, through repeated maximum flow
      computations.
4:    Determine the set of critical jobs $\mathcal{J}_{crit}$.
5:    Assign to the critical jobs speed $s_{crit}$ and set
      $\mathcal{J} = \mathcal{J} \setminus \mathcal{J}_{crit}$.
6:    Update $\mathcal{I}$, i.e., the number of available processors $m_j$ for each
      interval $I_j$.
7:    $s_{UB} = s_{crit}$, $s_{LB} = \max_{j_i \in \mathcal{J}}\{den_i\}$
8: Use the optimal algorithm for $P|r_i, d_i, pmtn|-$ to schedule each job with processing
   time $w_i/s_i$.

We denote by $s_{crit}$ the critical speed and by $\mathcal{J}_{crit}$ the set of critical
jobs. We know that each job will be executed with speed not less than its density.
Therefore, given a set of jobs $\mathcal{J}$, we know that there does not exist a
feasible schedule that executes all jobs with the same speed
$s < \max_{j_i \in \mathcal{J}}\{den_i\}$. Also, observe that no job has speed greater
than the initial value of $s_{UB}$ set in line 1 of Algorithm 1. These bounds define the
search space of the binary search for the first step of the algorithm in order to
determine the minimum speed for which there is a feasible schedule that executes all
jobs in $\mathcal{J}$ with the same speed. In the subsequent step, the current speed
(i.e. the critical speed of the previous step) is an upper bound on the speed of all
remaining jobs and a lower bound is the maximum density among them. We use these updated
bounds to perform a new binary search, and so on. At this point, note that binary search
has already been used in other works as part of optimal polynomial-time algorithms for
scheduling problems with speed scaling (see [4] and [H]).
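
The binary search of a single step can be sketched as follows. This is a minimal illustration, assuming that feasibility is monotone in the speed and that the upper bound is feasible; in BAL the oracle would be a maximum flow computation on the corresponding graph, whereas the toy oracle below covers only the special case of one interval in which all jobs are alive on $m$ identical processors (feasible exactly when no job needs more than the interval length and the total time fits on $m$ processors). All names are hypothetical.

```python
def critical_speed(is_feasible, s_lb, s_ub, accuracy=1e-6):
    """Smallest feasible speed in [s_lb, s_ub], up to `accuracy`.

    Assumes is_feasible(s_ub) holds and that feasibility is monotone in the speed.
    Uses O(log((s_ub - s_lb) / accuracy)) oracle calls.
    """
    lo, hi = s_lb, s_ub
    while hi - lo > accuracy:
        mid = (lo + hi) / 2
        if is_feasible(mid):
            hi = mid      # mid is feasible: the critical speed is at most mid
        else:
            lo = mid      # mid is infeasible: the critical speed is above mid
    return hi


def single_interval_oracle(works, length, m):
    """Toy feasibility test: all jobs alive in one interval of the given length."""
    def is_feasible(v):
        times = [w / v for w in works]
        return max(times) <= length and sum(times) <= m * length
    return is_feasible


if __name__ == "__main__":
    oracle = single_interval_oracle([2, 2, 2], 1.0, 2)
    # Lower bound: maximum density (2); upper bound: a speed known to be feasible.
    print(critical_speed(oracle, 2.0, 6.0))
```

Here three jobs of work 2 share two processors for one time unit, so the total work of 6 forces a common speed of 3 even though each density is only 2.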

In order to complete the description of our algorithm, it remains to explain the way
critical jobs are determined. Because of Lemma 5, this can be done by finding a minimum
$(s,t)$-cut in the graph $G'$ that corresponds to
$\langle \mathcal{J}, \mathcal{I}, v - \epsilon \rangle$ where $\mathcal{J}$ and
$\mathcal{I}$ correspond to the current instance of the WAP. Note that $\epsilon$ must be
such that $v - \epsilon$ is strictly greater than the next critical speed.

Algorithm BAL produces an optimal schedule, and this holds because any schedule
constructed by the algorithm satisfies the properties of Lemma 1.

Theorem 1 Algorithm BAL produces an optimal schedule. 

Proof: 

First of all, it is obvious that the algorithm assigns to every job a constant speed because 
each job is assigned exactly one speed in one iteration. Because of Lemma HI we know that 
all jobs that have $0 < t_{i,j} < |I_j|$ will have the same speed, because when such a job is critical
all other jobs of the same kind are critical as well and are assigned the same speed. For the
same reason, each job with $t_{i,j} = |I_j|$ is assigned either the same speed as all the jobs that
run during $I_j$, or a greater speed in a previous step.

Now, consider the case where $t_{i,j} = 0$ for a job $j_i$ during an interval $I_j \subseteq span_i$. When $j_i$
is assigned a speed by the algorithm, it is critical. Hence, in every interval $I_j$ in which $j_i$ is
alive, apart from the ones whose processors were already occupied in previous iterations, we
know that either $t_{i,j} = |I_j|$ or $\sum_{j_i \in A(j)} t_{i,j} = m_j |I_j|$, where $m_j$ is the number of available
processors. Therefore, if $t_{i,j} = 0$ then there are two cases: either (i) $I_j$ had all its processors
occupied in an iteration earlier than the one in which $j_i$ was assigned a speed, or (ii) this
happened in the same iteration and the minimum speed that a job has during this interval is
not less than that of $j_i$. Hence, $j_i$ cannot get a greater speed than any job executed during
Ij. Finally, because of Lemma 5, BAL correctly identifies the critical jobs at each step of 
the algorithm. The theorem follows. ■ 

We turn, now, our attention to the complexity of the algorithm. Because of Lemma [3] at 
least one job (all critical ones) is scheduled at each step of the algorithm. Therefore, there 
will be at most n steps. Assume that P is the range of all possible values of speeds divided by
our desired accuracy. Then binary search needs to examine O(log P) values of speed to determine the
next critical speed at one step. That is, BAL performs O(log P) maximum flow calculations
at each step. Thus, the overall complexity of our algorithm is $O(n f(n) \log P)$.

Relation of BAL with the algorithm of Albers et al. [3]. The high-level idea of 
the algorithm in [3] is similar to that of BAL. Both algorithms can be decomposed into
a number of steps (phases) and at each step, a subset of jobs (the critical ones) is scheduled.
The difference between the two algorithms is the way each step is performed. In [3], a step is
as follows: at the beginning, all remaining jobs are conjectured to be critical. Then, the set
of (potential) critical jobs is reduced through repeated maximum flow computations. Once
the set of critical jobs of a particular step is determined, their algorithm specifies the way
these jobs are executed. In the worst case, their algorithm performs n steps and the i-th
step involves n - i maximum flow computations. Therefore, the worst-case running time of
their algorithm is $O(n^2 f(n))$. In our case, BAL computes the speed of critical jobs through
binary search. Each iteration of the binary search involves a maximum flow computation. 
Once the critical speed is computed, the set of critical jobs can be found by computing a 
minimum-cut. BAL constructs the schedule once all the critical speeds are determined. 
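
The two counts of maximum flow computations compared in this paragraph can be written out explicitly. The following is only a worked restatement of the figures given in the text, under the assumption that a minimum cut costs one additional maximum flow computation per step:

```latex
\[
\underbrace{\sum_{i=1}^{n} (n-i) = \frac{n(n-1)}{2} = O(n^2)}_{\text{algorithm of [3]}}
\qquad\text{versus}\qquad
\underbrace{n \cdot \bigl(O(\log P) + 1\bigr) = O(n \log P)}_{\text{BAL}}
\]
```

Multiplying each count by f(n) gives the two running times; which of them is smaller depends on how log P compares with n.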

6 Makespan Minimization with a Budget of Energy

Algorithm BAL can be extended to obtain an optimal algorithm, say MBAL, for the problem 
of makespan minimization given a fixed budget of energy E. As before, preemption and 
migration are allowed and jobs have arbitrary release dates and works. In order to apply 
MBAL, we will need an upper and a lower bound on the makespan of the optimal schedule. 
Then, the algorithm uses binary search to compute the minimum makespan for which there 
is a feasible schedule consuming E units of energy. Two such bounds are $X_{LB} = \frac{1}{m}\left(\frac{W^{\alpha}}{E}\right)^{\frac{1}{\alpha-1}}$
and $X_{UB} = \max_i\{r_i\} + \left(\frac{W^{\alpha}}{E}\right)^{\frac{1}{\alpha-1}}$, where W is the total work of all jobs. The high-level
description of the algorithm is the following: 

Algorithm 2 MBAL 

1: Compute $X_{UB}$ and $X_{LB}$.
2: Perform binary search in $[X_{LB}, X_{UB}]$ to find the minimum makespan X* for which there
   is a feasible schedule that consumes an amount E of energy.
3: Return this schedule.
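
One way to motivate the two bounds computed in step 1 is the following short derivation. It is a sketch under stated assumptions rather than the authors' own argument: it assumes the usual power function $s^{\alpha}$ with $\alpha > 1$, m processors and total work W, and it reads the lower bound as the best case in which the work is spread evenly over all processors, and the upper bound as running everything on one processor after the last release date.

```latex
% Lower bound: all m processors run at the constant speed W/(mX) during [0, X].
\[
E \;\ge\; mX\left(\frac{W}{mX}\right)^{\alpha} = \frac{W^{\alpha}}{(mX)^{\alpha-1}}
\quad\Longrightarrow\quad
X \;\ge\; \frac{1}{m}\left(\frac{W^{\alpha}}{E}\right)^{\frac{1}{\alpha-1}} = X_{LB}.
\]
% Upper bound: one processor runs all the work at speed W/T for T time units,
% starting at the last release date and spending exactly E.
\[
T\left(\frac{W}{T}\right)^{\alpha} = E
\quad\Longrightarrow\quad
T = \left(\frac{W^{\alpha}}{E}\right)^{\frac{1}{\alpha-1}},
\qquad
X_{UB} = \max_i\{r_i\} + T.
\]
```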

In order to perform the binary search, given a value X, MBAL examines whether or not 
there is a feasible schedule of makespan X that consumes E units of energy. To do this, it 
runs algorithm BAL assuming that all jobs have a common deadline X. Then, it computes 
the minimum value of energy E* that a feasible schedule for the particular instance might 
have. If $E \geq E^*$, then there is a feasible schedule that executes the jobs using no more
than E energy with makespan X. Otherwise, there does not exist such a schedule. The 
complexity of MBAL is log P times the complexity of BAL, i.e. $O(n f(n) \log^2 P)$.
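
The feasibility test and the outer binary search can be sketched as follows. This is a hedged illustration only: `mbal_makespan` and `min_energy` are hypothetical names, `min_energy(X)` stands in for a run of BAL with common deadline X that returns the minimum energy E*, the power function is assumed to be $s^{\alpha}$, and the bounds are computed in the form $(W^{\alpha}/E)^{1/(\alpha-1)}$, divided by m for the lower bound and added to the largest release date for the upper bound.

```python
from typing import Callable, Sequence


def mbal_makespan(works: Sequence[float], releases: Sequence[float],
                  m: int, energy: float, alpha: float,
                  min_energy: Callable[[float], float],
                  accuracy: float = 1e-9) -> float:
    """Binary search for the smallest makespan X with min_energy(X) <= energy."""
    total = sum(works)
    base = (total ** alpha / energy) ** (1.0 / (alpha - 1.0))
    lo = base / m                # assumed lower bound X_LB
    hi = max(releases) + base    # assumed upper bound X_UB
    while hi - lo > accuracy:
        x = (lo + hi) / 2.0
        if min_energy(x) <= energy:
            hi = x               # budget suffices: try a smaller makespan
        else:
            lo = x               # budget exceeded: the makespan must grow
    return hi


# Toy stand-in for BAL: two jobs released at time 0 on two processors, so each
# job runs alone on one processor at constant speed w / X until the deadline X.
works, releases, alpha = [3.0, 1.0], [0.0, 0.0], 3.0


def toy_min_energy(x: float) -> float:
    return x * sum((w / x) ** alpha for w in works)


x_star = mbal_makespan(works, releases, m=2, energy=7.0, alpha=alpha,
                       min_energy=toy_min_energy)
```

In this toy instance the minimum energy for deadline X is 28 / X^2, so a budget of 7 gives a makespan of 2, which lies strictly between the two bounds.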

7 Conclusion 

We studied the energy minimization multiprocessor speed scaling problem with migration. 
We proposed a combinatorial polynomial-time algorithm based on a reduction to the maximum
flow problem. We also extended our result to the case where the objective is makespan
minimization given a budget of energy. Since there is not much work on problems with
migration, there are many directions and problems to be considered for multicriteria optimization.
All these problems seem to be very interesting and might require new algorithmic
techniques because of their continuous nature. In this context, we believe that the approach 
used in our paper may be useful for future works. 